A measure of the usefulness of a pass/fail testing decision procedure is the ratio of the utility of the given procedure to the utility of a procedure based on knowledge of scores on a criterion measure. It is computed from scores for a representative sample of persons tested. Utility functions may be specified by the test user or set by convention to be linear with unit slope. The utility ratio can be used for comparing tests or for selecting test items. (Author)
Consider the following situation: A decision-maker intends to use a test to make a decision about each of many persons. For each person the decision-maker must take one of two possible actions, which we will call "action A" and "action R". The decision-maker considers action A more appropriate for persons who score high on the test and action R more appropriate for low scorers; the letters A and R might stand for "accelerated program" and "regular program", or "award credit" and "refuse credit", or simply "accept" and "reject". The decision-maker will choose one specific point on the test-score continuum as the cutoff point; all persons with test scores at or above this point will receive action A, while all those with test scores below the cutoff will receive action R. We will use the symbol $x_o$ to refer to this cutoff score. We will restrict our attention to the situation in which there are no constraints on the numbers of persons assigned to actions A and R.

Now let us suppose that the decision-maker would like to validate this decision procedure against a criterion measure (either concurrently or predictively), by administering the criterion measure to a representative sample of persons who have taken the test. The higher a person's score on the criterion measure, the better the result of action A for that person and the worse the result of action R. At some point on the criterion-measure scale, the decision-maker would be undecided between actions A and R. We will use the symbol $y_o$ to refer to this indifference point. [1]

[1] The choice of $y_o$ is logically prior to the choice of $x_o$. Procedures for optimizing the choice of $x_o$ for a given value of $y_o$ are discussed by Davis, Hickman, and Novick (1973).



We can express our decision-maker's feelings mathematically in the form of two utility functions. Let $y_i$ represent the score of person $i$ on the criterion measure. Then let

$u_a(y_i)$ = utility of action A for person $i$,

$u_r(y_i)$ = utility of action R for person $i$,

where $u_a(y_o) = u_r(y_o) = 0$. That is, the zero on the utility scale is defined to be the value of either action at the indifference point. We assume that $u_a$ is an increasing function and $u_r$ a decreasing function, to reflect the greater importance of correct decisions about persons whose criterion performance is farther from the indifference point. [2] Figure 1 presents an example of a possible pair of utility functions. (Note that the criterion measure is plotted along the horizontal axis.)

Our decision-maker intends to use a decision procedure that can be expressed mathematically as follows: Let $x_i$ be the test score of person $i$ and let $x_o$ be the minimum passing score. Then we take action A for person $i$ if $x_i \ge x_o$ and action R if $x_i < x_o$. The utility of this decision procedure is the sum of the utilities of all the individual decisions:

$$U(x_o, y_o) = \sum_{x_i \ge x_o} u_a(y_i) + \sum_{x_i < x_o} u_r(y_i).$$

As a standard for comparison we have the utility of the ideal decision procedure based on knowledge of each person's performance on the criterion measure:

$$U(y_o) = \sum_{y_i \ge y_o} u_a(y_i) + \sum_{y_i < y_o} u_r(y_i).$$

[2] This feature of the situation distinguishes it from the "threshold utility" situation examined by Hambleton and Novick (1973) and by Petersen (1974).



Then the validity of the decision procedure based on $x_o$, as compared to the validity of the "ideal" decision procedure based on $y_o$, can be described by the ratio [3]

$$\frac{U(x_o, y_o)}{U(y_o)}.$$

This utility ratio is expressed as a function of the minimum passing score $x_o$ and the indifference point $y_o$ to emphasize its dependence on the choice of $x_o$ and $y_o$.
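The definition of the ratio can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `utility_ratio`, the five-person data set, and the choice of unit-slope linear utilities are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence


def utility_ratio(
    x: Sequence[float],
    y: Sequence[float],
    x_o: float,
    y_o: float,
    u_a: Callable[[float], float],
    u_r: Callable[[float], float],
) -> float:
    """Utility of the test-based procedure divided by the ideal utility."""
    if len(x) != len(y):
        raise ValueError("x and y must have one entry per person")
    # Test-based procedure: action A when x_i >= x_o, otherwise action R.
    observed = sum(u_a(yi) if xi >= x_o else u_r(yi) for xi, yi in zip(x, y))
    # Ideal procedure: action A when y_i >= y_o, otherwise action R.
    ideal = sum(u_a(yi) if yi >= y_o else u_r(yi) for yi in y)
    if ideal == 0:
        raise ValueError("ideal utility is zero; the ratio is undefined")
    return observed / ideal


# Hypothetical data: five persons, indifference point y_o = 50, cutoff x_o = 12.
y_o = 50.0
x = [12, 15, 13, 18, 11]   # test scores
y = [55, 62, 41, 70, 47]   # criterion scores
ratio = utility_ratio(
    x, y, x_o=12, y_o=y_o,
    u_a=lambda v: v - y_o,       # assumed unit-slope utility of action A
    u_r=lambda v: -(v - y_o),    # assumed unit-slope utility of action R
)
```

In this made-up sample the third person passes the test but falls below the indifference point, so the ratio is less than 1.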

. ^ . • i 

Because the denominator of the utility ratio is the maximum utility over all possible sets of decisions (the utility of a correct decision for every person), the utility ratio reaches its maximum at 1. The minimum value for the utility ratio is not necessarily -1 unless $u_a(y) = -u_r(y)$ for all values of $y$. The utility ratio equals zero when the harm from the bad decisions exactly balances the benefit from the good decisions. A negative utility ratio indicates that the decision procedure could have been improved by taking action A for the low scorers and action R for the high scorers. (This situation would be expected if the test were accidentally reverse-scored.)
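The lower bound can be checked with a short derivation. The sketch below assumes that the worst possible procedure is the one that makes the wrong decision for every person; the symbol $U_{\min}$ is introduced here for illustration only.

```latex
\begin{align*}
U_{\min} &= \sum_{y_i \ge y_o} u_r(y_i) + \sum_{y_i < y_o} u_a(y_i) \\
         &= -\sum_{y_i \ge y_o} u_a(y_i) - \sum_{y_i < y_o} u_r(y_i)
            && \text{if } u_a(y) = -u_r(y) \text{ for all } y \\
         &= -U(y_o),
\qquad \text{so} \qquad \frac{U_{\min}}{U(y_o)} = -1 .
\end{align*}
```

Without that symmetry the second step does not hold, so the minimum of the ratio can lie above or below -1.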

[3] This ratio does not correspond to the "utility ratio" defined for the threshold utility case by Petersen (1974). Petersen's utility ratio does not depend on observed data, but merely describes the utility functions.

One type of utility function that is of particular interest because of its simplicity and intuitive appeal is that represented by straight lines. Let $b_a$ be the benefit of accepting a person one unit above $y_o$ on the criterion measure and let $c_r$ be the cost of rejecting that person. Similarly, let $b_r$ be the benefit of rejecting and $c_a$ be the cost of accepting a person one unit below $y_o$ on the criterion measure. Then let

$$u_a(y) = \begin{cases} b_a (y - y_o) & \text{if } y \ge y_o \\ c_a (y - y_o) & \text{if } y < y_o \end{cases}$$

$$u_r(y) = \begin{cases} -c_r (y - y_o) & \text{if } y \ge y_o \\ -b_r (y - y_o) & \text{if } y < y_o \end{cases}$$

Utility functions of this form imply that the cost of a bad decision is proportional to the size of the error. Similarly, they imply that the benefit from a good decision is proportional to the size of the error that was avoided. The size of the error made or avoided is the absolute value of $(y_i - y_o)$. These utility functions could be described as "semi-linear"; they become fully linear when $b_a = c_a$ and $b_r = c_r$. [4]

Figure 2 illustrates a pair of utility functions of this form. Only the relative sizes of $b_a$, $c_a$, $b_r$, and $c_r$ affect the value of the utility ratio, as can be seen by multiplying all four coefficients by any positive constant $k$. This multiplication would have the effect of multiplying both utility functions by $k$. Therefore the numerator and denominator of the utility ratio would both be multiplied by $k$, leaving its value unchanged.
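The scale-invariance argument can be checked numerically. The following is a minimal sketch: the helper names `semi_linear` and `ratio`, the coefficient values, the constant `k`, and the data are all hypothetical choices made for illustration.

```python
def semi_linear(b_a, c_a, b_r, c_r, y_o):
    """Build the pair (u_a, u_r) of semi-linear utility functions around y_o."""
    def u_a(y):
        d = y - y_o
        return b_a * d if d >= 0 else c_a * d

    def u_r(y):
        d = y - y_o
        return -c_r * d if d >= 0 else -b_r * d

    return u_a, u_r


def ratio(x, y, x_o, y_o, u_a, u_r):
    """Utility of the test-based procedure divided by the ideal utility."""
    observed = sum(u_a(yi) if xi >= x_o else u_r(yi) for xi, yi in zip(x, y))
    ideal = sum(u_a(yi) if yi >= y_o else u_r(yi) for yi in y)
    return observed / ideal


# Hypothetical data and coefficients (b_a, c_a, b_r, c_r).
x = [12, 15, 13, 18, 11]
y = [55, 62, 41, 70, 47]
x_o, y_o = 12, 50
coeffs = (2.0, 3.0, 1.0, 1.5)
k = 7.0  # any positive constant

base = ratio(x, y, x_o, y_o, *semi_linear(*coeffs, y_o))
scaled = ratio(x, y, x_o, y_o, *semi_linear(*(k * c for c in coeffs), y_o))
```

Here `base` and `scaled` agree up to floating-point rounding, which is the numerical counterpart of the cancellation of $k$ in the numerator and denominator.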



What is the expected utility of a decision procedure in which actions A and R are assigned purely at random? Is it necessarily zero? Let $p_a$ and $p_r$ be the probabilities of assigning actions A and R, respectively. Then the expected utility of the decision procedure is

$$\sum_{\text{all } i} \left[ p_a u_a(y_i) + p_r u_r(y_i) \right].$$



[4] The coefficients $b_a$, $c_r$, $b_r$, and $c_a$ correspond to Petersen's (1974) utility values a, b, c, and d, respectively, except that in Petersen's approach they are not multiplied by the size of the error.

This expression will equal zero when the utility functions are such that $u_a(y) / u_r(y) = -p_r / p_a$ for all values of $y$. If the utility functions are fully linear,

$$u_a(y) = b_a (y - y_o), \qquad u_r(y) = -b_r (y - y_o),$$

then the expected utility of the random decision procedure will be positive when

$$\sum_{\text{all } i} \left[ p_a b_a (y_i - y_o) - p_r b_r (y_i - y_o) \right] > 0;$$

that is, when $(p_a b_a - p_r b_r)$ has the same sign as $(\bar{y} - y_o)$.

Therefore, if the average score on the criterion measure were far enough above the indifference point and the benefit or harm from action A sufficiently greater than that from action R, the decision-maker would do reasonably well by taking action A for all persons. (This example shows the importance of the requirement that the validation sample be representative of the group of persons about whom decisions are to be made.)
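The sign condition for the random procedure with fully linear utilities follows from factoring the sum. In the sketch below, $n$ (the number of persons in the sample) is notation introduced here for illustration.

```latex
\begin{align*}
\sum_{\text{all } i} \left[ p_a b_a (y_i - y_o) - p_r b_r (y_i - y_o) \right]
  &= (p_a b_a - p_r b_r) \sum_{\text{all } i} (y_i - y_o) \\
  &= n \, (p_a b_a - p_r b_r) \, (\bar{y} - y_o).
\end{align*}
```

Setting $p_a = 1$ and $p_r = 0$ gives the case of taking action A for all persons, whose utility $n \, b_a (\bar{y} - y_o)$ is positive whenever the mean criterion score lies above the indifference point.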

Why should a test user such as the decision-maker described at the beginning of this paper use the utility ratio for evaluating his test-based decision procedure on the basis of a criterion measure? Wouldn't one of the more familiar correlation-like statistics serve his purpose just as well? No, because none of the more familiar statistics uses all the information and only that information that the decision-maker actually uses in making his decisions and evaluating their results. The utility ratio treats the test score as a dichotomous variable because the test score is being used as a dichotomous variable. At the same time, it does not impose an unnecessary dichotomy on the criterion measure, as do the phi-coefficient and the per-cent-agreement statistic. While it treats the criterion measure as continuous, it takes into account the indifference point that forms a natural zero for the criterion measure and thus makes it a meaningful ratio scale. Finally, it allows the decision-maker to adopt whatever utility functions best reflect his values.

Traditionalists may object that a utility-based approach to test validation allows the decision-maker too much freedom to influence the value of the resulting coefficient. This objection can be overcome by establishing a convention of computing utility ratios on the basis of fully linear utility functions with equal slopes: $u_a(y) = y - y_o$, $u_r(y) = -(y - y_o)$. An administrator or researcher who proposes a different set of utility functions in a particular situation would then be obligated to show why the utility functions he advocates are more appropriate than those established by convention. [5]
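Under the unit-slope convention the utility ratio takes a particularly simple form. The derivation below is a sketch; the sign indicator $s_i$ ($+1$ if $x_i \ge x_o$, $-1$ otherwise) is notation introduced here for illustration.

```latex
\begin{align*}
U(x_o, y_o) &= \sum_{x_i \ge x_o} (y_i - y_o) - \sum_{x_i < x_o} (y_i - y_o)
             = \sum_{\text{all } i} s_i \, (y_i - y_o), \\
U(y_o)      &= \sum_{y_i \ge y_o} (y_i - y_o) - \sum_{y_i < y_o} (y_i - y_o)
             = \sum_{\text{all } i} \lvert y_i - y_o \rvert, \\
\frac{U(x_o, y_o)}{U(y_o)}
            &= \frac{\sum_{\text{all } i} s_i \, (y_i - y_o)}
                    {\sum_{\text{all } i} \lvert y_i - y_o \rvert}.
\end{align*}
```

Each person contributes the distance from the indifference point, counted positively for a correct decision and negatively for an incorrect one.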

The most obvious use of the utility ratio is for comparing two or more tests. However, it also offers a practical alternative to the use of traditional discrimination indices for selecting test items for a test intended to discriminate at a particular level of ability (either on an external criterion variable or on the test itself). It also allows the test constructor to specify the relative importance of identifying qualified versus unqualified examinees. Let $x_{ij} = 1$ if examinee $i$ answers item $j$ correctly and $x_{ij} = 0$ if he does not. Then if an external criterion variable is used as the basis for item selection, the utility ratio for a given item would be

$$\frac{\sum_{x_{ij} = 1} u_a(y_i) + \sum_{x_{ij} = 0} u_r(y_i)}{\sum_{y_i \ge y_o} u_a(y_i) + \sum_{y_i < y_o} u_r(y_i)}.$$

[5] Notice that whenever we use a traditional product-moment correlation to validate a test, we implicitly accept the convention that the utility of test score $x$ for a person with criterion value $y$ is given by the product $(x - \bar{x})(y - \bar{y})$.



If scores on the test itself were used as the basis for item selection, the test constructor would have to specify utility functions in terms of test scores. In this case the $y$'s in the above formula would refer to scores on the full test; $y_o$ would represent the score level at which maximum discrimination is desired.
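Item selection by utility ratio can be sketched as follows. This is a hedged illustration, not the paper's procedure in detail: the function name `item_utility_ratios`, the response matrix, the criterion scores, and the unit-slope utilities are all hypothetical. Each item is treated as a one-item pass/fail test, with a correct answer standing for action A.

```python
def item_utility_ratios(responses, y, y_o, u_a, u_r):
    """Return one utility ratio per item.

    responses[i][j] is 1 if person i answered item j correctly, else 0.
    y[i] is the criterion score (or full-test score) of person i.
    """
    # Ideal procedure: action A when y_i >= y_o, otherwise action R.
    ideal = sum(u_a(yi) if yi >= y_o else u_r(yi) for yi in y)
    n_items = len(responses[0])
    ratios = []
    for j in range(n_items):
        # Item-based procedure: action A for a correct answer, else action R.
        observed = sum(
            u_a(yi) if row[j] == 1 else u_r(yi)
            for row, yi in zip(responses, y)
        )
        ratios.append(observed / ideal)
    return ratios


# Hypothetical data: five examinees, three items, indifference point 50.
y_o = 50.0
y = [55, 62, 41, 70, 47]
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]
ratios = item_utility_ratios(
    responses, y, y_o,
    u_a=lambda v: v - y_o,       # assumed unit-slope utility of action A
    u_r=lambda v: -(v - y_o),    # assumed unit-slope utility of action R
)
best_item = max(range(len(ratios)), key=ratios.__getitem__)
```

In this made-up example the first item separates examinees perfectly at the indifference point, the second is only weakly useful, and the third behaves like a reverse-scored item, so a test constructor ranking items by utility ratio would keep the first.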